ABSTRACT



This study investigated how the adoption of a constructivist 
model of teaching and learning and simple item analysis techniques can be 
used to explore the instructor's pedagogical content knowledge in teaching 
elementary statistics. Descriptive data (percent of students responding to 
multiple-choice test options) are provided that support the case for specific 
student statistical learning problems on the following topics: calculation 
and interpretation of measures of central tendency and variability, 
understanding of reliability and validity, interpretation of correlation 
coefficients, estimation of correlation coefficients from graphic 
scatterplots, and the selection of the best test-retest reliability 
scenarios. It is suggested that item analysis findings from multiple-choice 
examinations can be used to discover student conceptual misunderstandings, 
improve classroom instruction, and refine test-item writing. An attached 
table graphically displays the 10 findings. (Author/ND) 
Statistical Content Errors for 
Students in an Educational Psychology Course 

The structure of knowledge has long been thought of as an 
important key to understanding how students learn and how 
teachers can help students reason at higher levels of cognitive 
functioning. The cognitive revolution in psychology has 
rekindled interest in earlier forms of knowledge development such 
as discovery learning, transfer of knowledge, and constructivist 
approaches to teaching and learning. 

Bruner (1960) advocated a deeper understanding of the 
structure of knowledge through discovery learning in specific 
content areas for improving comprehension, recall, transfer, and 
reasoning. When teachers can help students understand how 
knowledge is organized, such an organizational framework allows 
students to advance beyond levels of simply absorbing facts and 
move toward understanding concepts and principles and applying 
what they have learned. Since the structure of knowledge is 
radically different in academic content areas, each discipline 
must assume the challenge of identifying such structural elements 
based upon specific content knowledge. 

Constructivist approaches to teaching and learning also 
assert that the structure makes a difference in learning and that 
students create their own knowledge through personal perceptual 
processes. Narode (1987) proposed that constructivist "concepts 
and their symbolic representations contain hidden epistemologies 
which must be elucidated by education researchers and then 
communicated to educators and students" (p. 34). 

Although constructivism could be thought of as including two 
versions (developmental and sociocultural), these ideas are often 
directly linked to cognitive theorists such as Jean Piaget. 
DeVries (1997) recently argued that it is inaccurate to assume 
that Piaget's work only considered individualistic elements and 
outlined Piaget's lesser known social theory. As Airasian and 
Walsh (1997) reminded us, constructivism is not a theory which 
offers us an easy or direct instructional application in the 
classroom and the adoption of such a theoretical view leads to 
serious issues and problems that must be confronted by educators. 

The current interest in constructivism can also be seen as 
an integration of several different theoretical perspectives. 

For example, Herman (1995) has suggested that many of the goals 
of a constructivist approach to teaching and learning are very 
consistent with the humanistic education movement that rose to 
prominence in the 1970s in terms of such concepts as freedom to 
learn, student-centered learning, facilitation of learning, 
search for personal meaning, and active involvement in learning. 

More recently, Shulman, in an interview (see Shulman & 
Sparks, 1992), suggested that teachers need to enhance their 
pedagogical content knowledge (P.C.K.) in order to strive toward 
excellence in teaching. The P.C.K. element emphasizes the 
importance of domain-specific content in the teaching and 
learning process and proposes, for example, that teaching poetry 
is likely to be very different from teaching mathematics. 

Shulman (1988) highlighted how an outstanding teacher's 
expertise can be distinguished from the knowledge of a subject 
matter specialist: 

The teacher not only understands the content to be learned 
and understands it deeply, but comprehends which aspects 
of the content are crucial for future understanding of the 
subject and which are more peripheral and are less likely 
to impede learning if not fully grasped. The teacher 
comprehends which aspects of the content will likely pose 
the greatest difficulties for the pupil's understanding. 

The most crucial to learn is not always the most difficult; 
the most difficult is not always the most crucial. (p. 37) 

The teaching of psychology at the college and university 
level could be advanced if professors thought of teaching as 
scholarship, critically examined data from their own courses, and 
shared their findings in research colloquiums which focused upon 
the sharing of psychological and pedagogical content knowledge. 
For example, what do we know about how students construct their 
knowledge of statistical concepts? What types of learning 
problems, misunderstandings, and points of confusion are they 
likely to encounter when learning statistics? 

Rarely do teaching professors even consider subjecting their 
own teaching to the scrutiny of research even though such an 
investigation could offer valuable insights related to how 
students learn about crucial psychological content such as 
statistics. Becker (1996) reviewed some 500 sources in the 
current literature on statistics teaching and concluded that such 
resources yielded primarily anecdotal evidence and 
recommendations based upon the experiences and intuitions of 
instructors (only 30% of the articles were empirical studies). 

The present investigation was undertaken to explore some 
common errors, misunderstandings, and conceptual problems that 
were exhibited by students who were asked to apply basic 
statistical knowledge. Although students frequently experience 
considerable anxiety when learning statistics, very little 
research has systematically explored specific conceptual 
difficulties related to statistical knowledge which could be 
related to anxiety and poor student performance in this domain. 

Method 

Subjects 

A total of 101 undergraduate students enrolled in three 
distinct sections of an educational psychology course offered by 
a Department of Psychology at a small, state university campus in 
upstate New York served as subjects. The sample was composed of 
primarily female subjects (75%). 

Students over the two previous years had participated in a 
pilot study and helped the instructor field test and refine the 
course mastery materials and examinations. The investigator took 
detailed notes after classes and worked with students in focus 
groups to explore learning problems on the topic of statistics. 
Many students taking this course were preparing to become K-12 
teachers and only 35% of students had previously taken a 
statistics course. 

Materials 

The classroom multiple-choice examination questions (four 
possible options per item) which dealt with statistical topics 
were subjected to item analysis techniques in order to further 
develop and test hypotheses related to student learning problems. 
Students were allowed to use a calculator on the exams. Specific 
items on two different multiple-choice examinations employed as 
the regular part of the course requirements were used to measure 
student progress in comprehending statistical concepts. The two 
exams differed in both length and comprehensive nature: Exam #3 
(75 questions) and the Final Exam (100 questions). 
Kuder-Richardson 21 (KR-21) estimates of reliability over several 
past exam administrations averaged .83. 
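
For readers unfamiliar with the KR-21 estimate, the following minimal sketch applies the standard formula, r = (k/(k-1)) * (1 - M(k-M)/(k*s^2)), to the summary statistics in Table 1. It is an illustration only: the .83 figure reported above is an average over several past administrations, and the function name kr21 is a label introduced here, not part of the study.

```python
def kr21(k: int, mean: float, sd: float) -> float:
    """Kuder-Richardson 21 reliability estimate from summary statistics.

    k    -- number of dichotomously scored items on the exam
    mean -- mean total score
    sd   -- standard deviation of total scores
    """
    if k < 2 or sd <= 0:
        raise ValueError("KR-21 needs at least two items and a positive sd")
    variance = sd ** 2
    return (k / (k - 1)) * (1 - (mean * (k - mean)) / (k * variance))


# Summary statistics copied from Table 1 (an illustration, not the
# author's own computation).
exam3 = kr21(k=75, mean=56.02, sd=8.67)
final = kr21(k=100, mean=77.35, sd=13.42)
print(f"Exam #3: {exam3:.2f}   Final Exam: {final:.2f}")
```

Under these assumptions the estimates come out near .82 for Exam #3 and .91 for the Final Exam, which is broadly in line with the reported average.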

Descriptive statistics for total scores on the two exams are 
provided in Table 1. Each exam included a different array of 
statistical test items: 20 items on Exam #3 (27% of the items on 
that exam) and 13 items on the Final Exam (13% of the items on 
that exam). Students taking the Final Exam had the advantage of 
having already answered similar test items on this content and of 
clarifying mistakes made on Exam #3. All of the multiple-choice 
items used to evaluate the statistical concepts reported in this 
paper were written by the course instructor/investigator. 

It should be noted that students were learning very 
basic statistical concepts such as measures of central tendency, 
characteristics of the normal distribution, reliability and 
validity, correlation coefficients, linear and non-linear 
relationships among variables, scatterplots, and test-retest 
reliability. The course content did not address inferential 
statistics or hypothesis testing. 

Procedure 

Item analysis techniques were used to examine student 
responses and determine problematic areas within the course 
content. Descriptive statistics (cumulative % of students 
marking particular options) were used to identify problematic 
concepts and relationships that were resistant to direct teaching 
and student learning. Students were expected to perform at a 
high level on exams due to the mastery nature of the course 
design where (1) all course/exam content was covered in the 
textbook or handouts, (2) all exam content was covered in class, 
and (3) all students were given parallel practice exams that 
included answers and detailed written explanations of the 
answers. In short, students had multiple ways to learn the 
statistical content covered on the exam and they knew exactly 
which statistical content they would confront on the exam. 
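
The descriptive step described above can be sketched as a simple tally of the percent of students marking each option on one item. The response data and the function name option_percentages below are hypothetical and serve only to illustrate the computation; they are not drawn from the study.

```python
from collections import Counter

OPTIONS = ("Blank", "a", "b", "c", "d")


def option_percentages(responses):
    """Return the percent of students marking each option for one item."""
    total = len(responses)
    if total == 0:
        raise ValueError("no responses")
    # Anything that is not a recognized option is counted as a blank.
    counts = Counter(r if r in OPTIONS else "Blank" for r in responses)
    return {opt: 100 * counts[opt] / total for opt in OPTIONS}


# Hypothetical responses from 20 students to a single four-option item.
responses = ["d"] * 14 + ["a"] * 3 + ["b"] * 2 + [""]
for option, pct in option_percentages(responses).items():
    print(f"{option:>5}: {pct:5.1f}%")
```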

Results 

The distributions of student responses for nine exam items 
(five from Exam #3 and four from the Final Exam) are depicted in 
Table 2. These items exemplify the following statistical 
learning problems: 
(1) Students often forget to rank order the scores when 
calculating the median and confuse the mean with the median. 

(2) Students confuse measures of central tendency (mean, median, 
and mode) with measures of variability (range and standard 
deviation). 

(3) Students have problems understanding that extreme outliers 
in a distribution require the use of the median due to the 
distortion of the mean under these circumstances. 

(4) Students often confuse reliability and validity. They have 
difficulty conceptualizing that a valid measure is also a 
reliable measure; however, a reliable measure may or may not 
be a valid measure. 

(5) Students have considerable difficulty realizing that the 
optimal test-retest reliability coefficient must be a positive 
value. A high negative correlation or a near-zero coefficient 
does not imply high reliability. 

(6) Students become confused if asked to select the option with 
the most desirable psychometric characteristic from a choice 
between having high validity or high reliability. 

(7) Students experience difficulty when interpreting correlation 
coefficients in terms of the strongest and weakest 
predictors when both positive and negative values are 
provided. 

(8) Students demonstrate some degree of difficulty approximating 
what a scatterplot would look like for a specific 
correlation coefficient. They often confuse the + and - 
directionality and degree of relationship elements. 

(9) Students have great difficulty picking out a scatterplot 
which demonstrates high test-retest reliability. 

(10) Students have considerable difficulty understanding that 
correlation coefficients only depict linear relationships 
while scatterplots can represent linear and non-linear 
relationships between variables. 
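
Several of the learning problems listed above can be checked numerically. The sketch below uses the score sets and correlation coefficients that appear in the Table 2 test items; the curvilinear data set and all function names are hypothetical additions made for illustration.

```python
import math
from statistics import mean, median

# Findings 1 and 3: scores taken from the two Exam #3 items in Table 2.
scores = [91, 83, 78, 95, 88, 87, 80]
unsorted_middle = scores[len(scores) // 2]  # middle position without rank ordering
print("mean:", mean(scores), "median:", median(scores),
      "unsorted middle:", unsorted_middle)

with_outlier = [52, 65, 72, 44, 82, 234, 67, 77, 58, 62]
print("mean:", mean(with_outlier), "median:", median(with_outlier))


# Finding 7: the strength of a relationship is judged by the absolute value of r.
def strongest_relationship(coefficients):
    return max(coefficients, key=abs)


# Finding 5: test-retest reliability calls for the highest positive r.
def best_test_retest(coefficients):
    return max(coefficients)


# Finding 10: Pearson r summarizes only the linear part of a relationship.
def pearson_r(xs, ys):
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]  # hypothetical, perfectly curvilinear data
print(strongest_relationship([0.84, 0.12, -0.35, -0.89]))
print(best_test_retest([0.79, 0.58, -0.06, -0.89]))
print(pearson_r(xs, ys))
```

For the first score set the mean is 86, the median is 87, and the middle value of the unordered list is 95, which correspond to options a, d, and b of that item. For the second set the single extreme score pulls the mean to 81.3 while the median stays at 66. The curvilinear data yield a coefficient of zero even though the two variables are perfectly related.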

The proportional loadings (% of students attracted to the 
correct options) for the nine sample items indicated various 
levels of item difficulty (n=9 correct answers; mean=66.33%; 
sd=11.18; range: 52%-83%). The proportional loadings for the 27 
incorrect options (each item had three incorrect options) 
demonstrated that a wide range of subjects were attracted to 
these items (n=27 incorrect answers; mean=10.89%; sd=12.00; 
range: 0%-44%). Many of these loadings on incorrect options were 
small and inconsequential; however, eleven of the 27 loadings 
included 10% or more of the class (as many as 44% in one case) 
selecting an incorrect option. These popular loadings on 
incorrect responses were interpreted as key elements to 
understanding how large numbers of students misunderstood 
statistical concepts or became confused about the issue under 
scrutiny. 
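
The screening rule described above, flagging any incorrect option chosen by 10% or more of the class, can be sketched as follows. The item percentages and the function name popular_distractors are hypothetical; the 10% threshold is the one used in this paper.

```python
def popular_distractors(option_pcts, correct, threshold=10.0):
    """Return the incorrect options chosen by at least `threshold` percent."""
    return {
        option: pct
        for option, pct in option_pcts.items()
        if option != correct and pct >= threshold
    }


# Hypothetical response percentages for one four-option item.
item = {"a": 12.0, "b": 30.0, "c": 3.0, "d": 55.0}
print(popular_distractors(item, correct="d"))
```

Options flagged in this way are candidates for closer inspection, since a popular incorrect option may point to a shared misconception or to an ambiguously worded item.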

Discussion and Conclusion 

The findings of this study make it clear that proportional 
responses to incorrect answers can offer a useful guide to 
uncovering misconceptions and misunderstandings. It is crucial 
to make the empirical findings of a relatively large sample of 
students in a course drive the exploration of understanding how 
students conceptualize knowledge and how to make improvements in 
instructional design and evaluation techniques. 

If college instructors need to teach these statistical 
concepts in courses, they may wish to offer students special 
instruction in class highlighting such problematic elements of 
learning elementary statistics. When instructors better 
understand how students think about statistical content, they can 
promote learning, improve instruction, and advance evaluation all 
at the same time based upon detailed descriptive analyses of exam 
results. Professors who teach more advanced statistics courses 
might also wish to make certain that students do not harbor 
confusion in these and other essential concepts before teaching 
more complex subject matter based upon such fundamental ideas. 

Test items written around the ten problematic issues 
identified in this paper are likely to challenge students to 
think critically about these statistical concepts. The process 
of using item analysis results from classroom examinations to 
improve test item writing and class instruction is generalizable 
to all content areas of statistical knowledge. 

This is but a humble beginning in the quest to better 
understand how constructivism and pedagogical content knowledge 
can better inform teaching, learning, and evaluation. 

Undoubtedly, many other conceptual problems baffle students as 
they rapidly attempt to absorb course content and think critically 
about subject matter. Professors should consider using their 
research skills to uncover such problems in student perceptions 
and sharing their findings with a scholarly community devoted to 
understanding student learning and outstanding teaching. Let 
this paper become one small step in promoting teaching as a 
unique form of scholarship, fostering critical thinking among 
students, and striving for excellence in teaching. 
Table 1 

Descriptive Statistics for Total Exam Results 

                   Exam #3                    Final Examination 

n                  101                        100 
M                  56.02 (75%)                77.35 (77.35%) 
SD                 8.67                       13.42 
Range              39                         58 
Range of scores    32 (43%) - 71 (95%)        39 (39%) - 97 (97%) 
Question format    75 multiple-choice items   100 multiple-choice items 
Table 2 

Empirical Support for Findings Based Upon Question Item Analysis 

Finding #1 Students often forget to rank order the scores when 
calculating the median and confuse the mean with the median. 

Test Item from Exam #3 

What is the median for the following set of scores? 

91, 83, 78, 95, 88, 87, 80 

a. 86 

b. 95 

c. 89 

d. 87 

[Bar graph: percentage of students selecting each option (Blank, 1, 2, 3, 4); legible values: 5%, 11%, 82%] 



Finding #2 Students confuse measures of central tendency (mean, 
median, and mode) with measures of variability (range and 
standard deviation). 

Finding #3 Students have problems understanding that extreme 
outliers in a distribution require the use of the median due to 
the distortion of the mean under these circumstances. 

Test Item from Exam #3 

Consider the task of selecting the best measure of central 
tendency for the following scores: 

52, 65, 72, 44, 82, 234, 67, 77, 58, 62 

Which measure of central tendency would most accurately describe 
the above data set? 

a. mean 

b. median 

c. mode 

d. range 

[Bar graph: percentage of students selecting each option; legible value: 21%] 
Finding #4 Students often confuse reliability and validity. 
They have difficulty conceptualizing that a valid measure is also 
a reliable measure; however, a reliable measure may or may not be 
a valid measure. 

Test Item from Exam #3 

Which of the following is true? 

a. Validity is easier to determine and calculate than 
reliability. 

b. A reliable instrument must also be valid. 

c. Validity refers to consistency across testing situations. 

d. Reliability is a necessary, but not sufficient condition for 
validity. 



[Bar graph: percentage of students selecting each option; legible value: 53%] 
Finding #5 Students have considerable difficulty realizing that 
the optimal test-retest reliability coefficient must be a 
positive value. A high negative correlation or a near-zero 
coefficient does not imply high reliability. 

Test Item from Final Exam 

Which of the following correlation coefficients represents the 
best test-retest reliability? 

a. r=+.79 

b. r=+.58 

c. r=-.06 

d. r=-.89 

[Bar graph: percentage of students selecting each option; values not legible] 
Finding #6 Students become confused if asked to select the 
option with the most desirable psychometric characteristic from a 
choice between having high validity or high reliability. 

Test Item from Final Exam 

Which of the following would be considered the most valuable 
characteristic of a test from a psychometric perspective? 

a. High validity 

b. Low validity 

c. High reliability 

d. Low reliability 



[Bar graph: percentage of students selecting each option (Blank, 1, 2, 3, 4); legible value: 31%] 
Finding #7 Students experience difficulty when interpreting 
correlation coefficients in terms of the strongest and weakest 
predictors when both positive and negative values are provided. 

Test Item from Final Exam 

Which of the following represents the strongest relationship 
between two variables? 

a. r=+.84 

b. r=+.12 

c. r=-.35 

d. r=-.89 

[Bar graph: percentage of students selecting each option; legible values: 62%, 32%] 




Finding #8 Students demonstrate some degree of difficulty 
approximating what a scatterplot would look like for a specific 
correlation coefficient. They often confuse the + and - 
directionality and degree of relationship elements. 

Test Item from Exam #3 

Study the graph below and respond thoughtfully to the question. 



[Scatterplot of the relationship between two variables; figure not recoverable] 



Determine the nature of the relationship and estimate the value 
of the correlation coefficient statistic that is represented in 
the graph. Which of the following provides the best description 
of this relationship? 

a. positive relationship, r=+.98 

b. negative relationship, r=-.53 

c. zero-order relationship, r=+.03 

d. positive relationship, r=+.57 

[Bar graph: percentage of students selecting each option; legible values: 69%, 24%, 2%, 4%] 
Finding #9 Students have great difficulty picking out a 
scatterplot which demonstrates high test-retest reliability. 

Test Item from Final Exam 

Which of the following represents the most desirable test-retest 
reliability? 




[Scatterplot options and bar graph of response percentages (Blank, 1, 2, 3, 4); figures not recoverable] 

Finding #10 Students have considerable difficulty understanding 
that correlation coefficients only depict linear relationships 
while scatterplots can represent linear and non-linear 
relationships between variables. 

Test Item from Exam #3 
A scatterplot 

a. can only display the relationship between two non-linear 
variables. 

b. refers to a random depiction of the relationship between two 
variables. 

c. can only display the relationship between two linear 
variables. 

d. can depict linear and non-linear relationships between 
variables. 



[Bar graph: percentage of students selecting each option; legible value: 83%] 
SPECIAL NOTES: The possible test item options labeled a, b, c, 
and d are coded as follows by the computer: a=1, b=2, c=3, and 
d=4. The correct answers are denoted by the dark bar graphs. 